A Robust Document Image Binarization Technique for Degraded Document Images

Authors

  • Bolan Su
  • Shijian Lu
Abstract

Segmentation of text from badly degraded document images is a very challenging task due to the high inter- and intra-variation between the document background and the foreground text of different document images. In this paper, we propose a novel document image binarization technique that addresses these issues by using adaptive image contrast. The adaptive image contrast is a combination of the local image contrast and the local image gradient that is tolerant to text and background variation caused by different types of document degradation. In the proposed technique, an adaptive contrast map is first constructed for an input degraded document image. The contrast map is then binarized and combined with Canny's edge map to identify the text stroke edge pixels. The document text is further segmented by a local threshold that is estimated from the intensities of the detected text stroke edge pixels within a local window. The proposed method is simple, robust and involves minimal parameter tuning. It has been tested on three public datasets that were used in the recent Document Image Binarization Contest (DIBCO) 2009 and 2011 and the Handwritten Document Image Binarization Contest (H-DIBCO) 2010, and achieves accuracies of 93.5%, 87.8% and 92.03%, respectively, which are significantly higher than or close to those of the best-performing methods reported in the three contests. Experiments on the Bickley diary dataset, which consists of several challenging poor-quality document images, also show the superior performance of our proposed method compared with other techniques.
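
The pipeline described in the abstract can be illustrated with a small pure-Python sketch. It is a simplification under stated assumptions, not the authors' implementation: the abstract gives no formulas, so the blending weight `alpha`, the midpoint cut used to binarize the contrast map, the minimum edge count `n_min`, and the "mean plus half the standard deviation" rule are all hypothetical choices made for illustration. The intersection with Canny's edge map is omitted entirely, and the image is assumed to be a list of rows of 0-255 grayscale values.

```python
from statistics import mean, pstdev


def window_coords(img, r, c, radius):
    """Coordinates of the square window around (r, c), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    return [(y, x)
            for y in range(max(0, r - radius), min(rows, r + radius + 1))
            for x in range(max(0, c - radius), min(cols, c + radius + 1))]


def adaptive_contrast_map(img, radius=1, alpha=0.5, eps=1e-6):
    """Blend the normalised local contrast with the local gradient (max - min).

    alpha and the 0-255 scaling of the gradient are assumptions of this sketch.
    """
    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            vals = [img[y][x] for y, x in window_coords(img, r, c, radius)]
            hi, lo = max(vals), min(vals)
            contrast = (hi - lo) / (hi + lo + eps)  # local image contrast
            gradient = (hi - lo) / 255.0            # local image gradient, rescaled
            row.append(alpha * contrast + (1 - alpha) * gradient)
        out.append(row)
    return out


def binarize(img, radius=2, n_min=3):
    """Return a map with 1 for text pixels and 0 for background pixels."""
    cmap = adaptive_contrast_map(img)
    flat = [v for row in cmap for v in row]
    # Stand-in global threshold on the contrast map (midpoint of its range).
    cut = (max(flat) + min(flat)) / 2
    edge = [[v > cut for v in row] for row in cmap]  # candidate stroke-edge pixels

    out = []
    for r in range(len(img)):
        row = []
        for c in range(len(img[0])):
            # Intensities of the edge pixels inside the local window.
            near = [img[y][x]
                    for y, x in window_coords(img, r, c, radius) if edge[y][x]]
            # Assumed rule: enough edge pixels nearby, and the pixel is darker
            # than their mean intensity plus half their standard deviation.
            is_text = (len(near) >= n_min
                       and img[r][c] < mean(near) + pstdev(near) / 2)
            row.append(1 if is_text else 0)
        out.append(row)
    return out
```

Calling `binarize` on a light page with a single dark vertical stroke should mark only the stroke column as text, because the threshold is derived from the edge pixels around the stroke rather than from a global intensity cut.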

Similar resources

Binarization of Document Image

Document image binarization is performed in the preprocessing stage of document analysis and aims to segment the foreground text from the document background. A fast and accurate document image binarization technique is important for the ensuing document image processing tasks such as optical character recognition (OCR). Though document image binarization has been studied for many years, t...

Robust Document Image Binarization Technique for Degraded Document Images by using Morphological Operators

Producing robust document images from badly degraded ones requires discriminating the text from the background, which is a very challenging task. Many binarization techniques can be used to make document images more reliable, but the problems of thresholding and filtering remain unsolved. In the existing method, edge based segmentation can be done and Canny edge detect...

Document Image Binarization Technique for Degraded Document Images

Document image binarization is a vital pre-processing technique for document image analysis that segments text from badly degraded document images. In this paper, we propose a robust document image binarization technique that is based on the concept of adaptive image contrast. The adaptive image contrast which is formed by combining local image contrast and the local image gradient makes it tol...

Ancient Document Images Enhancement Using Phase Based Binarization

In this paper, we present a phase-based binarization model for degraded document images, as well as a post-processing method that can improve any binarization method and a ground truth generation tool. Many binarization techniques have been proposed in the literature for different types of binarization problems. They include an adaptive image contrast based document image binarization technique t...

A Survey on Degraded Document Image Binarization Techniques

Segmentation is the major technique used in image binarization to separate pixel values into two classes, black as foreground and white as background. Degraded document images are segmented by binarization in order to obtain clean images that match the original documents. Thresholding process is t...

Publication date: 2012